Session 8: Geospatial Data Visualization in Python

Prof. Rebecca A. Johnson

2024-09-23

Roadmap

  • Static plots with geopandas
    • Two types of background layers: street maps + census data
    • Spatial joins/intersections
  • Interactive plotting with folium

Geopandas

Two basic data structures:

  • GeoSeries: a single series that holds either spatial points, spatial lines, or spatial polygons
  • GeoDataFrame: one column of the dataframe is a GeoSeries column; similar to R, this is named the geometry column

Importing spatial data

Similar to R, can import geojson and other file formats

homeless_facilities = gpd.read_file("Homeless_Service_Facilities.geojson")
homeless_facilities.head()
   OBJECTID           PROGRAM_NAME  ... EDITED                    geometry
0         1  Shepherd Park Library  ...   None  POINT (-77.02704 38.98029)
1         2         Takoma Library  ...   None  POINT (-77.02016 38.97444)
2         3           Headquarters  ...   None  POINT (-77.02706 38.96692)
3         4    Chevy Chase Library  ...   None  POINT (-77.07546 38.96558)
4         5   Lamond-Riggs Library  ...   None  POINT (-76.99959 38.95514)

[5 rows x 69 columns]
type(homeless_facilities)
<class 'geopandas.geodataframe.GeoDataFrame'>
homeless_facilities.geometry
0      POINT (-77.02704 38.98029)
1      POINT (-77.02016 38.97444)
2      POINT (-77.02706 38.96692)
3      POINT (-77.07546 38.96558)
4      POINT (-76.99959 38.95514)
                  ...            
110    POINT (-76.98065 38.84928)
111    POINT (-76.98125 38.84619)
112     POINT (-76.97423 38.8436)
113    POINT (-77.00937 38.83133)
114    POINT (-77.00815 38.82999)
Name: geometry, Length: 115, dtype: geometry

Obtaining background layers

  • For street-map type layers, contextily package is one resource
  • For Census tract/county/state layers, can either obtain shapefiles from Census API directly or use pygris, similar to R’s tigris

Approach one: Street map layers

  • First create the plot (defaults to matplotlib)
  • Then, add the basemap as the background
points_plot = homeless_facilities.plot(color = "green", markersize = 0.5)
cx.add_basemap(points_plot, crs = homeless_facilities.crs)

Approach one: Street map layers

Approach two: census data as background

  • Unlike R, no clear equivalent to tidycensus that can pull both geometries + information about those geometries
  • Two options:
    • Obtain shapefiles directly from Census TIGER site: link here
    • Use pygris package (similar to tigris in R)

Working directly with shapefiles

dc_tracts = gpd.read_file(filename =
"https://www2.census.gov/geo/tiger/TIGER2020/TRACT/tl_2020_11_tract.zip")

dc_tracts.head()
  STATEFP  ...                                           geometry
0      11  ...  POLYGON ((-77.05018 38.92124, -77.05006 38.921...
1      11  ...  POLYGON ((-77.0463 38.91631, -77.0463 38.91637...
2      11  ...  POLYGON ((-77.03241 38.92657, -77.03217 38.926...
3      11  ...  POLYGON ((-77.04166 38.91418, -77.04166 38.914...
4      11  ...  POLYGON ((-77.04599 38.9145, -77.04574 38.9146...

[5 rows x 13 columns]
type(dc_tracts)
<class 'geopandas.geodataframe.GeoDataFrame'>

Plotting polygons

fig, ax = plt.subplots()
dc_tracts.plot(edgecolor = "white", alpha = 0.5, 
              ax = ax)
ax.set_axis_off()
fig.patch.set_visible(False)

Adding points to a polygon plot

Make sure to align the crs!

homeless_tractcrs = homeless_facilities.to_crs(dc_tracts.crs)
fig, ax = plt.subplots()
dc_tracts.plot(edgecolor = "black", alpha = 0.1, 
              ax = ax)
homeless_tractcrs.plot(ax = ax,
              marker = "o", color = "red", markersize = 0.5)
ax.set_axis_off()
plt.title("Homeless facilities in DC")
fig.patch.set_visible(False)
plt.show()

Adding points to a polygon plot

Pulling demographic information

Multiple options:

  • Can pull directly from the Census API
  • Can use the get_census() function within pygris

get_census() for pulling demographic info.

dc_poverty_info = get_census(
  dataset = "acs/acs5/profile",
  variables = "DP03_0119PE",
  year = 2020,
  params = {"for": "tract:*",
          "in": "state:11"},
  guess_dtypes = True, 
  return_geoid = True)

dc_poverty_info.head(3)
   DP03_0119PE        GEOID
0         23.9  11001007703
1         26.6  11001007707
2         29.7  11001007708

Merging with polygon information

See that similar to sf in R, geopandas helps maintain it as a spatial dataframe post merge

dc_tracts_wpov = dc_tracts.merge(dc_poverty_info,
              how = "inner", 
              on = "GEOID")
type(dc_tracts_wpov)
<class 'geopandas.geodataframe.GeoDataFrame'>

Plotting the result

See here for a list of colormap (cmap) options: link here

fig, ax = plt.subplots()
dc_tracts_wpov.plot(edgecolor = "black", alpha = 0.9, 
              ax = ax,
              column = "DP03_0119PE",
              legend = True,
              cmap = "Blues")
homeless_tractcrs.plot(ax = ax,
              marker = "o", color = "red", markersize = 0.5)
ax.set_axis_off()
plt.title("Homeless facilities in DC vs. % of tract living in poverty",
color = "white")
fig.patch.set_visible(False)
plt.show()

Plotting the result

Where we are

  • Static plots with geopandas
    • Two types of background layers: street maps + census data
    • Spatial joins/intersections
  • Interactive plotting with folium

Spatial intersection

  • Previous example was merging a spatial dataframe with a non-spatial attribute using a GEOID column
  • How can we do a spatial intersection where we find the count of homeless facilities within each tract?
  • Unlike intersects and intersection in R, which check all elements against each other, the equivalent operations in python only check pairs of rows
  • If want to avoid iteration, can use the sjoin command

Step one: perform the spatial intersection/join

tracts_w_facilities = gpd.sjoin(dc_tracts_wpov,
                homeless_tractcrs,
                predicate = "contains")

dc_tracts_wpov.shape[0]
206
homeless_tractcrs.shape[0]
115
len(tracts_w_facilities.GEOID.unique())
76
tracts_w_facilities.shape[0]
115

Step two: aggregate by tract

## group the dataframe to find the count in tracts with 1+ overlap
count_pertract_1plus = tracts_w_facilities.groupby('GEOID').size().\
                      reset_index() 
count_pertract_1plus.columns = ["GEOID", "count_facilities"]

## create a similar dataframe with the tracts that had
## no intersections with a facility (so count = 0)
nonjoined_tracts = set(dc_tracts_wpov.GEOID).\
        difference(tracts_w_facilities.GEOID)
count_pertract_zero = pd.DataFrame({'GEOID': 
            list(nonjoined_tracts),
        'count_facilities': 0}) 
        
count_pertract_all = pd.concat([count_pertract_1plus,
                    count_pertract_zero])

assert count_pertract_all.shape[0] == \
          len(count_pertract_all.GEOID.unique())
          
## merge back using GEOID
dc_tracts_wcount = dc_tracts.merge(count_pertract_all,
        how = "inner", on = "GEOID") 
        

Step three: plot the results with a categorical shading

fig, ax = plt.subplots()
dc_tracts_wcount.plot(edgecolor = "black", alpha = 0.9, 
              ax = ax,
              column = "count_facilities",
              categorical = True,
              legend = True,
              cmap = "Blues")
ax.set_axis_off()
plt.title("Count of homeless facilities per DC census tract", 
color = "white")
fig.patch.set_visible(False)
plt.show()

Step three: plot the results with a categorical shading

Where we are

  • Static plots with geopandas
    • Two types of background layers: street maps + census data
    • Spatial joins/intersections
  • Interactive plotting with folium

Folium for interactive plotting

  • Wrapper for leaflet which we covered in R
  • Similar capabilities including zoom and hoverable/clickable markers and shapes

Step one: create the background map

Default is open street map

m = folium.Map(location = [38.9047, -77.016], 
    zoom_start = 12)
m
Make this Notebook Trusted to load map: File -> Trust Notebook

Example one: add markers with hoverable + clickable information

  • Tooltips: pull up information as you hover over a marker/point
  • Popups: pull up information as you click on a marker/point - can also add the ability to zoom in on that point
  • The geoJson set of commands within folium can accept either geojson or geopandas dataframes for plotting

Example one: add markers with hoverable + clickable information

Use add_to() to add to the basemap

homeless_folium = homeless_facilities[['PROGRAM_NAME',
'WEBSITE_URL', 'geometry', 'TARGET']].to_crs(4326).copy() 
m = folium.Map(location = [38.9047, -77.016], 
    zoom_start = 12)
folium.GeoJson(homeless_folium,
    name = "Homeless facilities",
    marker = folium.Circle(radius = 5,
            fill_color = "green",
            fill_opacity = 0.5,
            color = "green"),
    tooltip = folium.GeoJsonTooltip(fields = ["PROGRAM_NAME",
    "WEBSITE_URL", "TARGET"],
    aliases = ["Name:", "URL:", "Target group:"]),
    popup = folium.GeoJsonPopup(fields = ["PROGRAM_NAME",
    "WEBSITE_URL", "TARGET"],
    aliases = ["Name:", "URL:", "Target group:"]),
    zoom_on_click = True).add_to(m)

Example one: add markers with hoverable + clickable information

<folium.features.GeoJson object at 0x284d4f280>
Make this Notebook Trusted to load map: File -> Trust Notebook

Example two: add information to a choropleth map

Step 1: start with a choropleth map

dc_tracts_folium = dc_tracts_wpov.to_crs(4326).copy()
m = folium.Map(location = [38.9047, -77.016], 
    zoom_start = 11)
    
folium.Choropleth(geo_data = dc_tracts_folium,
    name = "Household poverty",
    data = dc_tracts_folium,
    columns = ["GEOID", "DP03_0119PE"],
    key_on = "feature.properties.GEOID",
    fill_color = "Blues").add_to(m)

Example two: add information to a choropleth map

<folium.features.Choropleth object at 0x284b256a0>
Make this Notebook Trusted to load map: File -> Trust Notebook

Example two: add information to a choropleth map

Step 2: add hoverable information

Need to use the geojson attribute from the choropleth map

dc_tracts_folium = dc_tracts_wpov.to_crs(4326).copy()
m = folium.Map(location = [38.9047, -77.016], 
    zoom_start = 11)
    
cp = folium.Choropleth(geo_data = dc_tracts_folium,
    name = "Household poverty",
    data = dc_tracts_folium,
    columns = ["GEOID", "DP03_0119PE"],
    key_on = "feature.properties.GEOID",
    fill_color = "Blues").add_to(m)
    
cp.geojson.add_child(folium.features.GeoJsonTooltip(['DP03_0119PE'],
aliases = ['Poverty rate:']))

Example two: add information to a choropleth map

<folium.features.GeoJson object at 0x284aa7bb0>
Make this Notebook Trusted to load map: File -> Trust Notebook

Wrapping up

  • Static plots with geopandas
    • Two types of background layers: street maps + census data
    • Spatial joins/intersections
  • Interactive plotting with folium